BUG: Preserve extension dtypes in MultiIndex during concat (#58421) #61211
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Fix Summary:

Previously, the `_make_concat_multiindex` method could silently downgrade extension dtypes (e.g., to `object`) when creating levels. This PR ensures that the `_concat_indexes` helper uses the correct dtype-aware construction (`array(..., dtype=...)`) to preserve the original dtype of the first index.

Test added:
Added a test in `pandas/tests/frame/methods/test_concat_arrow_index.py` that covers the preservation of extension dtypes when `pd.concat` is called with `keys=`, which triggers MultiIndex creation.

The test creates two DataFrames with `timestamp[pyarrow]` indices, concatenates them with `pd.concat(..., keys=...)`, and asserts that the `MultiIndex` level (`levels[1]`) retains `ArrowDtype('timestamp[us][pyarrow]')` instead of being downgraded to `object`.

This ensures the dtype preservation fix is validated and guarded against regression.